Web Page Classification Based on Uncorrelated Semi-Supervised Intra-View and Inter-View Manifold Discriminant Feature Extraction

نویسندگان

  • Xiao-Yuan Jing
  • Qian Liu
  • Fei Wu
  • Baowen Xu
  • Yang-Ping Zhu
  • Songcan Chen
چکیده

Web page classification has attracted increasing research interest. It is intrinsically a multi-view and semi-supervised application, since web pages usually contain two or more types of data, such as text, hyperlinks and images, and unlabeled pages are generally much more than labeled ones. Web page data is commonly high-dimensional. Thus, how to extract useful features from this kind of data in the multi-view semi-supervised scenario is important for web page classification. To our knowledge, only one method is specially presented for this topic. And with respect to a few semisupervised multi-view feature extraction methods on other applications, there still exists much room for improvement. In this paper, we firstly design a feature extraction schema called semi-supervised intra-view and inter-view manifold discriminant (SIMD) learning, which sufficiently utilizes the intra-view and inter-view discriminant information of labeled samples and the local neighborhood structures of unlabeled samples. We then design a semi-supervised uncorrelation constraint for the SIMD schema to remove the multi-view correlation in the semi-supervised scenario. By combining the SIMD schema with the constraint, we propose an uncorrelated semi-supervised intra-view and inter-view manifold discriminant (USIMD) learning approach for web page classification. Experiments on public web page databases validate the proposed approach.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Intra-View and Inter-View Supervised Correlation Analysis for Multi-View Feature Learning

Multi-view feature learning is an attractive research topic with great practical success. Canonical correlation analysis (CCA) has become an important technique in multi-view learning, since it can fully utilize the inter-view correlation. In this paper, we mainly study the CCA based multi-view supervised feature learning technique where the labels of training samples are known. Several supervi...

متن کامل

Supervised Feature Extraction of Face Images for Improvement of Recognition Accuracy

Dimensionality reduction methods transform or select a low dimensional feature space to efficiently represent the original high dimensional feature space of data. Feature reduction techniques are an important step in many pattern recognition problems in different fields especially in analyzing of high dimensional data. Hyperspectral images are acquired by remote sensors and human face images ar...

متن کامل

Fisher Discriminant Analysis (FDA), a supervised feature reduction method in seismic object detection

Automatic processes on seismic data using pattern recognition is one of the interesting fields in geophysical data interpretation. One part is the seismic object detection using different supervised classification methods that finally has an output as a probability cube. Object detection process starts with generating a pickset of two classes labeled as object and non-object and then selecting ...

متن کامل

Feature reduction of hyperspectral images: Discriminant analysis and the first principal component

When the number of training samples is limited, feature reduction plays an important role in classification of hyperspectral images. In this paper, we propose a supervised feature extraction method based on discriminant analysis (DA) which uses the first principal component (PC1) to weight the scatter matrices. The proposed method, called DA-PC1, copes with the small sample size problem and has...

متن کامل

Semi-Supervised Based Hyperspectral Imagery Classification

Hyperspectral imagery classification is a challenging problem. Wherein, the high number of spectral channels and the high cost of true sample labeling greatly reduce the classification precision. In this paper, we proposed a semi-supervised method, which combine linear discriminant analysis and manifold learning, to improve the precision of hyperspectral imagery classification. Experimental res...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015